Detecting temporal and spatial malaria patterns from first antenatal care visits

Pregnant women attending first antenatal care (ANC) visits represent a promising malaria surveillance target in Sub-Saharan Africa. Here we assessed the spatio-temporal relationship between malaria at ANC (n=6,471), in children at the community (n=9,362) and at health facilities (n=15,467) in southern Mozambique (2016–2019). ANC P. falciparum rates detected by quantitative polymerase chain reaction mirrored rates in children, regardless of gravidity and HIV status (Pearson correlation coefficient [PCC]>0.8, χ2<1.1), with a 2–3 months lag. Only at rapid diagnostic test detection limits at moderate-to-high transmission, multigravidae showed lower rates than children (PCC=0.61, 95%CI[−0.12–0.94]). Seroprevalence against the pregnancy-specific antigen VAR2CSA reflected declining malaria trends (PCC=0.74, 95%CI[0.24–0.77]). 80% (12/15) of hotspots detected from health facility data using a novel hotspot detector, EpiFRIenDs, were also identified with ANC data. The results show that ANC-based malaria surveillance offers contemporary information on temporal trends and the geographic distribution of malaria burden in the community.


Introduction
Surveillance is key to inform optimal and equitable resource allocation for malaria control and elimination 1 . Estimating malaria trends from clinical cases at health facilities remains challenging due to differences in care-seeking behaviour, unknown denominator populations, and asymptomatic infections 2 .
These biases are minimised in nationally-representative cross-sectional surveys, but due to their high costs and complex logistics, they are typically only conducted every 2-3 years 3 . Pregnant women attending a first antenatal care (ANC) visit have been proposed as a potential convenience group for surveillance of malaria and other infectious diseases [3][4][5] .
In sub-Saharan Africa, 79% of pregnant women attend at least one ANC visit 6 , offering a good representation of the population. Since visits are unrelated to illness, malaria testing is not biased by care-seeking behaviour or testing decisions, and captures asymptomatic infections 7 . A meta-analysis of pooled prevalence data from Sub-Saharan Africa found a strong correlation between malaria burden in pregnant women and children, with lower rates in pregnant women but with large heterogeneity between studies 8 . Less heterogeneity was found for low-prevalence settings (prevalence < 5%), and less difference between women and children was found when restricting the analysis to primigravidae. However, only one study recruited women from an antenatal clinic, and small-scale trends could not be assessed because the pooled data were obtained from different administrative levels. Studies using routine ANC data in Tanzania did not provide information on gravidity and were unable to reproduce a linear effect similar to that of the meta-analysis when comparing young women to children [9][10][11] . Another study analysed data from health centres in conflict settings in the Democratic Republic of Congo and found a strong but non-linear relationship between ANC prevalence and incidence in children 12 . However, all the studies used low-sensitivity microscopy or rapid diagnostic tests (RDT), missing the significant proportion of P. falciparum infections with parasite densities below the detection threshold of conventional field diagnostic tools 23 . Moreover, the effect of gravidity and other factors such as HIV was not consistently assessed. Also, spatial patterns in malaria burden have not been compared between both groups due to lack of geospatial data. Finally, none of the studies quantified correlation in low-transmission settings, where ANC-based surveillance might be particularly attractive due to local clustering of malaria cases, which are difficult to monitor with traditional strategies 2 .
Therefore, a better understanding of the validity of ANC prevalence data for monitoring transmission in the community and the factors that affect this relationship remains to be developed.
New surveillance tools, such as antibodies against the pregnancy-specific antigen VAR2CSA that mediates parasite sequestration in the placenta [13][14][15] , can potentially increase sensitivity to detect recent exposure in low transmission settings where detecting active infections is difficult 15,16 . Combined with novel clustering approaches, this ANC data can increase the resolution to detect spatial patterns 3 , therefore allowing a cost-effective approach for targeting interventions to the most affected areas. In this study, we estimate and compare malaria burden at first ANC visits with data from cross-sectional surveys and clinical cases in three settings from southern Mozambique with different transmission levels. We correlate temporal and spatial trends at both RDT- and qPCR-detection levels, and characterise the effect of HIV and gravidity on the correlation. Finally, we assess the added value of antibody data obtained from a bead-based multiplex immunoassay against VAR2CSA and general malaria antigens, and a newly developed hotspot detection algorithm, as novel tools to improve surveillance in malaria endemic areas.

Study area and population
The study was conducted between November 2016 and November 2019 in Manhiça and Magude districts in Maputo Province, southern Mozambique. Malaria transmission is low in Manhiça district 17 , with some moderate-to-high transmission areas, such as Ilha Josina 18 . Magude district is a low-transmission area resulting from elimination interventions since 2015 19 . Data was obtained from 6,471 pregnant women (Supplementary figure S1), residing in the study area, who attended their first ANC visit at Manhiça District Hospital, Ilha Josina Health Centre, or Magude Health Centre, as previously described 20 . Weekly numbers of RDT-positive clinical malaria cases among children < 5 years old attending the three health facilities (n = 15,467) were obtained from the District Health Information System 2 (DHIS2). In Manhiça district, 37,131 RDT and microscopy results from children < 5 years attending health facilities were available from the paediatric outpatient morbidity surveillance system (OPMSS). Data from 9,362 children aged 2-10 years was collected in age-stratified cross-sectional surveys conducted every May from 2015 to 2019. Geo-localization of pregnant women and children was obtained from a local health and demographic surveillance system using their permanent or family identification number 21,22 , from their household identification number or by registering the geolocalization of the households (Supplementary methods).

Parasitological And Immunological Determinations
Finger prick blood drops were collected onto Whatman 903 filter paper (dried blood spots [DBS]) from pregnant women and from children in the cross-sectional surveys. Children were also tested by RDT (HRP2-based SD Bioline Ag Pf, Standard Diagnostics, South Korea). P. falciparum infection was detected and quantified in duplicate from DBS with a qPCR targeting the 18S rRNA gene on an ABI PRISM 7500 HT Real-Time System (Applied Biosystems) 23 . Immunoglobulin Gs (IgG) were detected and quantified in a multiplexed bead array using Luminex xMAP© technology (Luminex Corp., Austin TX), as described previously 20 . In brief, magnetic beads were coupled to our panel (Supplementary table S1  respectively. Infections with densities above 100 parasites/µL were defined as RDT-detectable 24 . Primigravidity was defined as a first pregnancy, and multigravidity as having had one or more previous pregnancies. HIV status of pregnant women was determined from the maternal health card, or if not available, with an HIV serological rapid test 20 . The threshold of seropositivity against P. falciparum antigens was defined as the geometric mean plus 2 standard deviations of the first component from two-component normal mixture distributions of mean fluorescent intensity values (R package mixtools).
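As an illustration of the seropositivity cutoff described above, the following minimal sketch fits a two-component normal mixture by expectation-maximisation and returns the mean plus 2 standard deviations of the lower component. It is not the mixtools code used in the study: the function name `seropositivity_threshold`, the log10 transformation of the fluorescence values (our reading of "geometric mean"), the initialisation and the variance floor are all assumptions made for the example.

```python
import numpy as np

def seropositivity_threshold(mfi, n_iter=500, tol=1e-9):
    """Hypothetical helper: cutoff = mean + 2 SD of the lower component of a
    two-component normal mixture fitted to log10(MFI), back-transformed."""
    x = np.log10(np.asarray(mfi, dtype=float))  # assumption: work on the log10 scale
    # Initialise the component means from the lower and upper halves of the data.
    med = np.median(x)
    mu = np.array([x[x <= med].mean(), x[x > med].mean()])
    sd = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
        total = dens.sum(axis=1)
        resp = dens / total[:, None]
        # M-step: update weights, means and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        sd = np.maximum(sd, 1e-6)  # floor to avoid a collapsed component
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    first = int(np.argmin(mu))  # "first component" taken as the lower-mean one
    return 10 ** (mu[first] + 2 * sd[first])
```

Applied to a vector of mean fluorescent intensity values for one antigen, samples above the returned value would be classed as seropositive. The sketch assumes strictly positive intensities and a clearly bimodal distribution; poorly separated components would need a more careful fit.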
The relationship between P. falciparum parasite rates using RDT (PfPR RDT ) or qPCR (PfPR qPCR ) detection limits in pregnant women and children was analysed using linear regressions, Pearson correlation coefficients (PCC), and χ² statistics (Supplementary methods). Similarly, anti-P. falciparum seroprevalence in pregnant women was compared with PfPR RDT and PfPR qPCR in children. Consistency and correlation between temporal variations in both populations were quantified using χ² statistics and PCC, and time lags between the data sources were defined by maximising PCC (Supplementary methods). Two-point correlation functions (2PCF) were used for clustering analysis. The 2PCF describes the excess of pairs of P. falciparum positive samples with respect to random infections as a function of the geographical distance between them:

$$\xi(r) = \frac{DD(r)}{RR(r)} - 1,$$

where $\xi(r)$ is the 2PCF at distance $r$, $DD(r)$ is the number of pairs of positive cases between populations 1 and 2 separated by a distance $r$, and $RR(r)$ is the number of background pairs (positive or negative) between both populations at distance $r$. $DD(r)$ and $RR(r)$ were normalised by $n_1 n_2$ and $N_1 N_2$ respectively, where $n_{1,2}$ and $N_{1,2}$ are the number of positive samples and the total number of samples respectively for populations 1 and 2. The 2PCF measurements were done using 10 distance bins in a range from 0 to 60 km. The bins were chosen based on the spatial range of pairwise distances, balancing spatial granularity against the statistical power (sample size) of the measurements. The agreement between different 2PCFs was quantified with χ² statistics.
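The pair-counting definition of the 2PCF given in this paragraph can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used in the study: the function names `pair_counts` and `two_point_cf` are hypothetical, coordinates are assumed to be already projected to kilometres, and the estimator is taken to be the ratio of normalised positive pairs to normalised background pairs minus one.

```python
import numpy as np

def pair_counts(xy_a, xy_b, edges):
    """Histogram of all pairwise distances between two sets of points."""
    d = np.linalg.norm(xy_a[:, None, :] - xy_b[None, :, :], axis=-1)
    return np.histogram(d.ravel(), bins=edges)[0].astype(float)

def two_point_cf(xy1, pos1, xy2, pos2, edges):
    """Cross 2PCF between populations 1 and 2 (hypothetical helper).

    xy1, xy2 : (N, 2) coordinates in km; pos1, pos2 : boolean positivity flags.
    Returns one value per distance bin; bins with no background pairs are NaN.
    """
    xy1, xy2 = np.asarray(xy1, dtype=float), np.asarray(xy2, dtype=float)
    pos1, pos2 = np.asarray(pos1, dtype=bool), np.asarray(pos2, dtype=bool)
    # Positive-positive pairs, normalised by the product of positive counts.
    dd = pair_counts(xy1[pos1], xy2[pos2], edges) / (pos1.sum() * pos2.sum())
    # Background pairs (positive or negative), normalised by total counts.
    rr = pair_counts(xy1, xy2, edges) / (len(xy1) * len(xy2))
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(rr > 0, dd / rr - 1.0, np.nan)
```

With `edges = np.linspace(0, 60, 11)` this gives the 10 bins between 0 and 60 km mentioned in the text. When both inputs are the same population, zero-distance self-pairs are counted in the first bin and would need to be excluded; each population must also contain at least one positive sample.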
A novel malaria hotspot detector, Epidemiological Foci Relating Infections by Distance (EpiFRIenDs), was developed to detect areas with higher levels of P. falciparum infections (hotspots) and seropositivity against P. falciparum antigens (seroclusters) than statistically expected in a stable period of time 2 .
EpiFRIenDs was designed to detect structures of arbitrary shapes and sizes that account for the background population distribution, in contrast to the most commonly used scan statistics based on the SaTScan software 25,26,27 , which detect structures of predefined shapes. EpiFRIenDs detects hotspots and seroclusters by linking positive cases when they are closer than a given predefined distance and indirectly linking them to all the positive cases that are close to their connections. The negative cases are then included in the hotspots of their close positive cases. The EpiFRIenDs software is publicly available in Python 28 and R 29 and is described in detail in the section EpiFRIenDs of Supplementary material. The hotspot detection using EpiFRIenDs and SaTScan (scan statistics) was first compared in simulated data with a prevalence of 20% reproducing three different scenarios: first, a random spatial distribution of positive and negative cases; second, four circular clusters of positive cases on top of a background random distribution of negative cases; and third, a sinusoidal distribution of positive cases on top of a background random distribution of negative cases. EpiFRIenDs was then applied to identify hotspots and serological clusters from ANC, OPMSS and antibody data (Supplementary methods).
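The linking step described in this paragraph resembles a friends-of-friends clustering, which can be sketched as below. This is an illustrative re-implementation under our own assumptions and does not use or reproduce the API of the published EpiFRIenDs packages: the function name `link_hotspots` and its parameters are hypothetical, coordinates are assumed to be projected to the same units as the linking distance, and the comparison of each candidate hotspot against the background positivity is omitted.

```python
import numpy as np

def link_hotspots(xy, positive, link_d, min_pos=3):
    """Label each case with a hotspot id (0, 1, ...) or -1 if it is in none."""
    xy = np.asarray(xy, dtype=float)
    positive = np.asarray(positive, dtype=bool)
    pos_idx = [int(i) for i in np.flatnonzero(positive)]

    # Union-find over positive cases: direct links propagate to indirect ones.
    parent = {i: i for i in pos_idx}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for k, a in enumerate(pos_idx):
        for b in pos_idx[k + 1:]:
            if np.linalg.norm(xy[a] - xy[b]) <= link_d:
                parent[find(a)] = find(b)

    # Group positives by root and keep groups reaching the size threshold.
    groups = {}
    for i in pos_idx:
        groups.setdefault(find(i), []).append(i)
    labels = np.full(len(xy), -1, dtype=int)
    next_id = 0
    for members in groups.values():
        if len(members) >= min_pos:
            labels[members] = next_id
            next_id += 1

    # Attach each negative case to the hotspot of its nearest linked positive.
    hot_pos = np.flatnonzero(labels >= 0)
    hot_labels = labels[hot_pos]
    if hot_pos.size:
        for i in np.flatnonzero(~positive):
            d = np.linalg.norm(xy[hot_pos] - xy[i], axis=1)
            j = int(np.argmin(d))
            if d[j] <= link_d:
                labels[i] = hot_labels[j]
    return labels
```

Because groups grow through chains of neighbours rather than inside a fixed window, elongated or curved structures can be recovered, which is the property contrasted with scan statistics in the text. The pairwise loop is quadratic in the number of positive cases; a spatial index would be preferable for large datasets.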
Hotspots were compared between ANC and clinical data from the Manhiça District Hospital and the Ilha Josina Health Centre, for which geolocation information was available. Since EpiFRIenDs is a density-based clustering algorithm, clinical data was randomly sub-sampled to obtain the same sample size of positive cases as for ANC data, with five hundred different random sub-samples to obtain statistical significance. Linking distance (1 km), temporal windows (one month or one year) and hotspot size thresholds (three or five positive cases per hotspot) were defined based on sample density and on estimates of the minimum number of false detections from 500 realisations with random infections (Supplementary methods). A hotspot from ANC data was considered to be matched by a hotspot from clinical data (or vice versa) if at least one member of an ANC hotspot was found closer than 2 km to a member of a hotspot from clinical data.
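The matching criterion stated above can be written compactly. The following is a minimal sketch with hypothetical function names (`hotspots_match`, `matched_fraction`); it assumes each hotspot is given as a list of member coordinates already projected to kilometres.

```python
import numpy as np

def hotspots_match(members_a, members_b, max_d=2.0):
    """True if any member of hotspot A lies closer than max_d km to any member of B."""
    a = np.asarray(members_a, dtype=float)
    b = np.asarray(members_b, dtype=float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return bool((d < max_d).any())

def matched_fraction(hotspots_a, hotspots_b, max_d=2.0):
    """Fraction of hotspots in A matched by at least one hotspot in B."""
    matched = [
        any(hotspots_match(h, g, max_d) for g in hotspots_b)
        for h in hotspots_a
    ]
    return sum(matched) / len(matched) if matched else float("nan")
```

Calling `matched_fraction(clinical_hotspots, anc_hotspots)` would give the share of clinical hotspots that are also identified from ANC data, and swapping the arguments gives the converse. Temporal overlap between hotspots is not checked in this sketch.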
All analyses were stratified by gravidity and HIV status, the main factors shown to affect PfPR qPCR in our study population (Supplementary table S2), with primigravid HIV-negative women considered separately to take potential correlations between gravidity and HIV into account 32  . At RDT-detection levels, parasite rates in pregnant women and children showed lower correlations and weaker linear relationships (Fig. 1D-F, Supplementary table S3). Only primigravidae showed a 1-to-1 linear relationship with children (Fig. 1E), whereas multigravidae showed lower rates that were not correlated (PCC = 0.61 [95% CI -0.12-0.94]), with a linear regression slope not consistent with equality (0.17 [95% CI -0.035-0.49]; Fig. 1K). However, in low-transmission Magude and Manhiça, good consistency of both PfPR qPCR and PfPR RDT between pregnant women and children was observed (χ²<1.10), regardless of gravidity or HIV (Fig. 1).

P. falciparum hotspots
The performance of EpiFRIenDs and SaTScan in detecting clusters on simulated data was compared for three different scenarios (Supplementary material section EpiFRIenDs). In the first scenario, with a random spatial distribution of positive and negative cases, the only difference between the methods was the detection of false small clusters in small-scale parameterisations of EpiFRIenDs, which can be corrected by parameter calibration (Methods, Supplementary methods). In the second scenario, positive cases were correctly identified as part of the four circular clusters by both methods. In the third scenario, EpiFRIenDs correctly assigned all positive cases to the sinusoidal cluster, which could not be detected with SaTScan.
Due to its higher capacity to detect clusters of arbitrary shape, further analysis was conducted using EpiFRIenDs. With this spatial algorithm, 10 hotspots, all in Ilha Josina, were detected with qPCR results from all geolocalized pregnant women (n = 3,616; Fig. 3). Four of them, and the two most persistent ones, occurred during the first year (Fig. 3 Fig. 4A,B). Correlations remained high across groups of gravidity and HIV status; however, wide confidence intervals increased to include potentially no correlation for some antibodies (Fig. 4A). Significant declines in seroprevalence across all areas were only observed for VAR2CSA DBL3-4 (Fig. 4B

Discussion
This population-based spatio-temporal analysis of parasitological and serological data from southern Mozambique shows that qPCR-positivity rates at first ANC visit reflect rates in children with a time lag of 2-3 months relative to clinical cases, regardless of the women's gravidity, HIV status, and the transmission intensity in their area. Disparities emerge at RDT-detection levels for multigravid women in moderate-to-high transmission settings, indicating the need to consider gravidity in the analysis when using diagnostic tools of limited sensitivity. However, gravidity did not affect the number of hotspots detected, which were similar to those detected using passive surveillance data. Finally, VAR2CSA seroprevalence at first ANC visit was found to be highly correlated with parasite rates in children, sensitive to temporal and spatial trends that were missed by RDT data in children, and unaffected by gravidity, therefore constituting a robust adjunct for ANC surveillance. Overall, this data provides evidence for the potential value of pregnant women for programmatic surveillance in malaria endemic regions in sub-Saharan Africa.
Our study, for the first time, provides evidence that parasite rates in pregnant women are highly correlated and consistent with rates in children from cross-sectional surveys at qPCR-detection levels, across the transmission spectrum, and regardless of HIV status and gravidity. The lower RDT-based rates in multigravid women compared with children in high-to-moderate transmission Ilha Josina, in line with previous reports 8, 10,11 , are probably explained by immunity acquired during successive pregnancies, which enables the control of parasite densities below the RDT-detection limit 30,31 . However, temporal trends in ANC-based parasite rates, both detected by qPCR and RDT, reflected trends in clinical cases observed 2-3 months earlier, similar to the 3-month time lag observed in the Democratic Republic of Congo 12 . This lag suggests that infections at ANC are older than those in symptomatic children, in accordance with previous studies showing that infections in pregnancy mainly result from a boosting of infections acquired before pregnancy 32 . RDT rates in multigravid women showed shorter time lags, possibly due to faster clearance of infections by anti-parasite immunity acquired during previous pregnancies. Despite the time lag, ANC data would still be useful to benchmark passive surveillance estimates and population-based cross-sectional surveys, improving population denominators and informing about the burden of asymptomatic infections.
Similar spatial patterns of P. falciparum cases and hotspots were observed among pregnant women at ANC and children in the community, regardless of the women's gravidity or HIV status. Spatial clustering decreased from year 1 to year 2, reflecting trends in burden. The novel software EpiFRIenDs revealed finer spatial structures of P. falciparum infections and detected several hotspots from ANC and clinical case data. ANC data detected 80% of the clinical case hotspots. At RDT-detection levels, fewer hotspots were identified from HIV-positive and multigravidae than when including all women, probably due to lower sample size and positivity rates. Differences in the temporal distribution of hotspots were observed between ANC and clinical case data, which might result from the time lag observed, different denominator populations, inclusion of asymptomatic cases in ANC data, and variations in care-seeking behaviour. The sample size of clinical cases was limited in order to compare it with ANC data, although the larger sample size of clinical data would probably improve the precision of hotspot detection. However, passive surveillance systems do not usually record the geolocation of children routinely, precluding spatial analysis. Further studies are required to assess the value of ANC data for identifying pockets of transmission missed by case-based surveillance for supplementing reactive strategies.
Among the 14 antigens evaluated in this study, DBL3-4, derived from the pregnancy-specific antigen VAR2CSA 33 , was found to be the most promising marker for ANC-based sero-surveillance. DBL3-4 seroprevalence at first ANC visit showed high correlation with qPCR rates in children, across all gravidity and HIV-status groups. Temporal trends in DBL3-4 seroprevalence also showed the highest correlation with temporal trends in clinical cases. Furthermore, DBL3-4 was the only serological marker that mirrored declines of qPCR parasite rates in all three areas. DBL3-4 was able to detect declines missed by RDT, demonstrating the power of this serological marker in settings with few RDT-detectable cases. Along with several other antibodies, DBL3-4 also showed great potential to detect spatial patterns in transmission. In year 2, more sero-clusters were identified than qPCR hotspots. In particular, sero-clusters were found to persist in a low-transmission area after elimination interventions had been deployed and very few P. falciparum cases were detected. This shows the ability of serological markers to capture recent transmission dynamics, which would be especially useful to demonstrate continuous absence of transmission following elimination. A rapid serological test for antibodies against DBL3-4 at ANC could represent a low-cost surveillance tool with improved ability to detect trends missed by RDT, or to detect recent cases in elimination settings. Importantly, even if VAR2CSA-based vaccines under development are widely administered in the near future, DBL3-4 can still be used for sero-surveillance as it is not a vaccine target 34 .
This study has several limitations. First, data collection might have been affected by RDT stock outs, changes in reporting practices 21 and human errors 19 , and geolocalisation of clinical cases was only collected in Manhiça district. Second, sample sizes differ substantially between clinical, ANC and cross-sectional data, limiting the power of comparisons, especially for spatial analysis. Third, women not attending ANC tend to be older, live in rural settings, and be of lower socio-economic status than attending women, which are all risk factors for malaria 3 . However, this selection bias is likely to be low in sub-Saharan Africa due to high ANC attendance 6 . Also, other factors not assessed in this study might affect the relationship between malaria in children and in pregnant women, such as other co-infections or age. Finally, single-time-point measurements may limit the ability to infer true infection and serological status due to the complex dynamics of parasites and antibody responses in the infected host 33,35 .
Conducting similar studies in different epidemiological settings and varying ANC attendance levels can provide useful information to confirm the generalizability of this novel surveillance approach.
In conclusion, malaria testing of pregnant women at their first ANC visit can provide estimates of temporal and spatial trends in malaria burden that reflect those observed in children. However, the time lag of 2-3 months relative to clinical cases, together with gravidity and diagnostic test sensitivity in high transmission settings, need to be considered when interpreting ANC data. The bias introduced by infections missed by RDT in multigravid women at moderate-to-high transmission levels can be avoided by restricting the analysis to primigravidae, using highly sensitive detection tools, such as qPCR or VAR2CSA serology, or with models accounting for immunity. ANC data can also be used to detect malaria hotspots reflecting those detected with passive surveillance data, potentially providing a cost-efficient approach to tailor interventions to areas most in need. The new spatial algorithm developed in this study, EpiFRIenDs, showed its superiority compared to scan statistics using SaTScan 25,26,27 in detecting irregular hotspots, which better reflect the spatial distribution of human populations. However, a prior calibration of EpiFRIenDs is required to minimize the detection of false small hotspots, which will be automatically performed in future versions of the software. This algorithm constitutes a new tool to link foci detection with appropriate targeted approaches. Finally, antibodies against the pregnancy-specific parasite antigen VAR2CSA could serve as a more resilient marker of spatio-temporal malaria trends measured at ANC. Taken together with other potential benefits of a continuous ANC-based surveillance approach, including more precise denominator populations, and the ability to capture asymptomatic infections, surveilling pregnant women at first ANC visit has great potential to complement existing surveillance systems in Africa.
for year 3) and from children from visits to health facilities (G for year 1, H for year 2 and I for year 3) in Manhiça district (colour coded by their visit date). Cases circled in red belong to hotspots detected using temporal windows of one month. J: Temporal distribution of number of hotspots detected from ANC data (orange) and from clinical data (blue). K: Histogram of lifetimes of identified hotspots from ANC data (orange) and from clinical data (blue). L,M: Timeline of identified hotspots with their size (y-axis) and colour coded by their timeline from ANC data (L) and clinical malaria cases data (M).

Declarations

Figure 4
Comparison of spatial and temporal trends in seropositive women at first antenatal care visit and Plasmodium falciparum infection in children.
A: Pearson correlation coefficients between PfPR qPCR in children 2-10 years old from cross-sectional surveys and seroprevalence in pregnant women at first ANC visits. Results are shown for different gravidity and HIV status groups from pregnant women (from left to right), and each colour represents a